with one suggested by Owen, Chen, and Li but has the advantage that no special
tables  are  needed. 


A  Selection  Procedure  Using  a  Screening  Variate 

Richard  W.  Madsen* 

University  of  Missouri  -  Columbia 

1.  Introduction  and  Background 

Consider N objects on which two correlated measurements X and Y can be made. Assume that the probability that the Y measurement meets a certain specification (e.g. $Y \le u$) is $\gamma$. We present a method whereby a maximal subset of m out of the N objects can be chosen, based on the observed X measurements, so that there is a high probability ($\zeta$) that a large proportion ($\Pi > \gamma$) of the selected subset will meet the desired specification related to the Y measurement. In general such a selection procedure would be used when Y is based on a measurement which is difficult or expensive to make and X is based on one which is easier or less expensive to make. For example, measuring Y may actually destroy the item being tested whereas measuring X will not. In another situation Y might be a student's grade point average after a number of years in college, while X is a score made on an entrance or qualifying examination.

*Research supported in part by the Office of Naval Research, Contract ONR-N00014-76-C-0789.

A related problem considers an infinite population where we assume that a certain proportion, say $\gamma$, of the observed Y variates satisfy some specification. By screening on the observed value of the correlated variable X, it may be possible to raise the proportion of Y variates which satisfy the specification to a higher value, say $\delta$. We will assume, as is generally done, that X and Y have a bivariate normal distribution with correlation coefficient $\rho$.

This problem has been studied for quite some time, with the work of Taylor and Russell [8] in 1939 being among the earliest. More recently D. B. Owen and various co-researchers ([3], [4], [7], and [9]) have studied other aspects of this problem. For example, Thomas, Owen, and Gunst [9] considered two screening variables $X_1$ and $X_2$; Li and Owen [3] considered two sided screening procedures; Owen and Boddie [4] considered screening methods with some parameters unknown. In these cases, sharp cut-off scores are found such that if the X score is in a given range, say $X \le \mu_X + k\sigma_X$, the corresponding item is selected.

Much of the work done in this area has been to tabulate values of k corresponding to values of $\gamma$, $\rho$, etc., to meet certain specifications. Hence one potential deterrent to implementation of these screening procedures is the need for specialized tables. A second point to consider is that with the usual procedures the precise value of X is not used; only the fact that X is above or below a given cut-off score is used. The procedure presented here has the advantage of not needing special tables (other than standard normal tables). It also makes use of the precise observed value of X, not simply whether or not the score is above a given cut-off. We assume that there are a finite number N of items available for screening.


2.  The  Selection  Procedure 


Consider a finite collection of objects, say N objects, on which it is possible to make measurements X and Y which come from a bivariate normal distribution with correlation coefficient $\rho > 0$. Assume that an item is acceptable if $Y \le u$ and that the overall proportion of such acceptable items is to be raised from $\gamma$ (before screening) to $\delta$ (after screening). Following the procedure of Owen, Chen, and Li [5], we might find a value k such that an item is selected if $X \le \mu_X + k\sigma_X$. The value k, of course, is a function of the parameters. For this value of k,

$$P[Y \le u \mid X \le \mu_X + k\sigma_X] = \delta. \tag{1}$$

While in an exceedingly large population (which we might take to be "infinite") the proportion of selected items which are acceptable will be $\delta$, in a finite set of selected items the actual proportion of acceptable items will be a random variable. Specifically, if m items are selected, then the actual number of those items for which $Y \le u$ will be a binomial random variable, say V, with parameters m and $\delta$. If we want the proportion of acceptable items in the finite set of selected items to be at least $\Pi$ with probability at least $\zeta$, i.e. if we want

$$P[V \ge \Pi m] = \sum_{j=\ell}^{m} \binom{m}{j} \delta^{j} (1-\delta)^{m-j} \ge \zeta, \tag{2}$$

where $\ell = \ell(m)$ is the smallest integer greater than or equal to $\Pi m$, then $\delta$ must be chosen suitably large.

By using the interrelationships among the binomial, beta, and F distributions it can be shown that a suitable choice for $\delta$ is given by

$$\delta = \frac{\ell}{\ell + (m - \ell + 1)\,F_{\zeta;\,2m-2\ell+2,\,2\ell}},$$

where $F_{\zeta;\,a,\,b}$ is the $(1-\zeta)\cdot 100\%$ upper tail percentage point for an F distribution having a and b degrees of freedom for the numerator and denominator.

This value of $\delta$ is then used in (1), and the value of k to satisfy the equality in (1) can be found by using tables given in Owen, McIntire, and Seymour [6]. Note that the value of m must be specified in advance and hence is a fixed quantity.
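As a rough numerical cross-check on the choice of $\delta$, the binomial tail in (2) can be solved for $\delta$ directly. The sketch below swaps bisection in for the F-distribution formula, so it is not the tabular route described above, although both should target the same value. The function names `binom_upper_tail` and `smallest_delta` and the sample inputs are illustrative assumptions.

```python
import math

def binom_upper_tail(m, ell, delta):
    """P[V >= ell] for V ~ Binomial(m, delta), i.e. the sum in (2)."""
    return sum(math.comb(m, j) * delta**j * (1.0 - delta)**(m - j)
               for j in range(ell, m + 1))

def smallest_delta(m, big_pi, zeta, tol=1e-10):
    """Smallest delta with P[V >= ell(m)] >= zeta, located by bisection.

    Assumes 0 < big_pi <= 1 so that 1 <= ell(m) <= m.
    """
    # ell(m): smallest integer >= big_pi * m (guard against float round-off)
    ell = math.ceil(big_pi * m - 1e-9)
    lo, hi = 0.0, 1.0          # the tail probability is increasing in delta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_upper_tail(m, ell, mid) >= zeta:
            hi = mid           # mid already satisfies (2)
        else:
            lo = mid
    return hi

# Hypothetical inputs: m = 100 selected items, Pi = .6, zeta = .90
delta = smallest_delta(m=100, big_pi=0.6, zeta=0.90)
print(round(delta, 4))
```

The returned value can then play the role of $\delta$ in (1); finding k from it still requires the bivariate normal tables or an equivalent computation.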

In the procedure we propose, we assume that a large lot of N items is available for screening. The values of X, call them $x_1, x_2, \ldots, x_N$, are found for each item. Using the conditional distribution of Y given X = x, calculate $p_i$ by

$$p_i = P(Y \le u \mid X_i = x_i) = \Phi\!\left[\frac{u - \mu_Y - \rho\,(\sigma_Y/\sigma_X)(x_i - \mu_X)}{\sigma_Y\sqrt{1-\rho^2}}\right],$$

where $\Phi[\cdot]$ represents the CDF of a standard normal random variable. Since we assume the parameters of the bivariate normal distribution are known, we can, wlog, take them to be $\mu_X = \mu_Y = 0$, $\sigma_X^2 = \sigma_Y^2 = 1$. In this case we have

$$p_i = \Phi\!\left[(u - \rho x_i)/\sqrt{1-\rho^2}\right]. \tag{3}$$
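The conditional probability above is straightforward to evaluate with only a standard normal CDF. The following is a minimal sketch, assuming known parameters; the names `std_normal_cdf` and `accept_prob` are hypothetical, and $\Phi$ is computed from the error function.

```python
import math

def std_normal_cdf(z):
    """Phi(z), the standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def accept_prob(x, u, rho, mu_x=0.0, mu_y=0.0, sigma_x=1.0, sigma_y=1.0):
    """P(Y <= u | X = x) for a bivariate normal with known parameters.

    With the default (standardized) parameters this reduces to equation (3).
    """
    cond_mean = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)
    cond_sd = sigma_y * math.sqrt(1.0 - rho**2)
    return std_normal_cdf((u - cond_mean) / cond_sd)

# Standardized case, equation (3), with illustrative inputs
print(accept_prob(x=-0.5, u=-0.2533, rho=0.9))
```

Calling `accept_prob` with unstandardized parameters and the correspondingly shifted and scaled x and u gives the same probability, which is the sense in which the standardization loses no generality.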

By first ordering the $x_i$'s, we can assume that $p_1 \ge p_2 \ge \cdots \ge p_N$. Now for each value of m, define $\ell(m)$ to be the smallest integer greater than or equal to $\Pi \cdot m$. The selection procedure is to select the m* items having the largest $p_i$ values, where m* is the largest integer satisfying a relationship like (2), namely

$$P[V(m^*) \ge \ell(m^*)] \ge \zeta. \tag{4}$$

In so doing we select as many items as possible subject to satisfying the constraint given in (4). (Note that this kind of situation might be desirable for a manufacturer who produces lots of N items and wishes to sell a sub-lot of size m*, as large as possible, such that a proportion $\Pi$ or more (say a guaranteed proportion) of the screened items are satisfactory with probability at least $\zeta$.)

In order to calculate the probability in (4), it is necessary to note that since we first order the X values and then consider the conditional distribution of the corresponding Y values given the X's, we are dealing with what are known as the concomitants of order statistics. (See David, O'Connell, and Yang [2].) It follows from Bhattacharya's work [1] that the $Y_i$ values, conditional on the ordered $X_{(i)}$ values, are independent with conditional distributions which are normal with

$$\mu_{Y|x_i} = \rho x_i, \qquad \sigma^2_{Y|x_i} = 1 - \rho^2.$$

Now let the $x_i$ values be given and define

$$W_i = \begin{cases} 1 & \text{if } Y_i \le u \\ 0 & \text{otherwise.} \end{cases}$$

Then the $W_i$ are (conditionally) independent Bernoulli random variables and $P(W_i = 1) = p_i$. Consequently

$$P[V(m) \ge \ell(m)] = P\Big[\sum_{i=1}^{m} W_i \ge \ell(m)\Big] = \sum_{\alpha} \prod_{i=1}^{m} p_i^{\alpha_i} (1 - p_i)^{1-\alpha_i} \equiv \zeta_m, \tag{5}$$

where the sum is taken over all vectors $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_m)$ satisfying $\alpha_i = 0$ or 1, $\sum_i \alpha_i \ge \ell(m)$. We then take m* to be the largest value of m such that $\zeta_m \ge \zeta$.
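Summing over all 0-1 vectors $\alpha$ in (5) takes on the order of $2^m$ terms. As a computational aside, and not as part of the original derivation, the same probability can be built up one item at a time. Writing $f_k(j) = P[W_1 + \cdots + W_k = j]$ (notation introduced only for this sketch, with $f_{k-1}(j) = 0$ when $j < 0$ or $j > k-1$), conditional independence of the $W_i$ gives

```latex
f_0(0) = 1, \qquad
f_k(j) = (1 - p_k)\, f_{k-1}(j) + p_k\, f_{k-1}(j-1), \quad k = 1, \dots, m,
\qquad
P[V(m) \ge \ell(m)] = \sum_{j=\ell(m)}^{m} f_m(j).
```

A single pass through $k = 1, \ldots, N$ needs roughly $N^2/2$ updates and yields the probability in (5) for every m along the way.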

In trying to find m*, one could systematically calculate $\zeta_m$ for m = N, N-1, N-2, ..., stopping as soon as $\zeta_m \ge \zeta$. However it is not necessary to check all values of m because some values are inadmissible. Specifically, if $\ell(m) = \ell(m+1)$, then the value m is inadmissible. (For example if $\Pi = .8$, then $\ell(4)$ = (smallest integer $\ge$ (.8)(4) = 3.2) = 4 and $\ell(5) = 4$. Since

$$P[V(4) \ge \ell(4) = 4] \le P[V(5) \ge \ell(5) = 4],$$

it follows that $\zeta_4 \le \zeta_5$. We would never take m* to be 4 since if $\zeta_4 \ge \zeta$, it must also be true that $\zeta_5 \ge \zeta$. Hence we say that when $\Pi = .8$, the value m = 4 is inadmissible.) Since N is the size of the lot, take N to be admissible. Since the $p_i$ are in decreasing order, it might appear that for admissible m's, the quantities $\zeta_m$ are strictly decreasing. However because of the rounding upwards that is done in calculating $\ell(m)$, this need not be the case. Consequently by following the given algorithm we can be sure to find m*.
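The admissibility rule can be checked mechanically. The sketch below lists the admissible sizes for a given N and $\Pi$; the helper names `ell` and `admissible_sizes` are hypothetical, and a small tolerance is subtracted before rounding up so that products such as (.6)(5) are not pushed past an integer by floating-point error.

```python
import math

def ell(m, big_pi):
    """Smallest integer >= big_pi * m (with a guard against float round-off)."""
    return math.ceil(big_pi * m - 1e-9)

def admissible_sizes(n_items, big_pi):
    """Sizes m in 1..N with ell(m) != ell(m + 1); N itself is always kept."""
    sizes = [m for m in range(1, n_items)
             if ell(m, big_pi) != ell(m + 1, big_pi)]
    sizes.append(n_items)
    return sizes

print(admissible_sizes(10, 0.6))   # [1, 3, 5, 6, 8, 10]
```

With $\Pi = .8$ the same function drops m = 4, in line with the example in the text.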


(1) Place the observed $x_i$ in increasing order and relabel the x's so that $x_1 \le x_2 \le \cdots \le x_N$.

(2) Find $p_i = P[Y_i \le u \mid X_i = x_i] = \Phi[(u - \rho x_i)/\sqrt{1-\rho^2}]$, i = 1, 2, ..., N.

(3) Find the admissible $m_j$, $m_1 < m_2 < m_3 < \cdots < m_e = N$.

(4) Set j = e and find $\zeta_{m_j}$ by using equation (5).

(5) If $\zeta_{m_j} \ge \zeta$, set $m^* = m_j$. Otherwise reduce j by 1 and calculate the next $\zeta_{m_j}$.
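The steps above can be sketched end to end as follows. This is an illustrative implementation under stated assumptions, not the author's program: X and Y are taken as standardized with known $\rho > 0$, the probability in (5) is accumulated one item at a time instead of by enumerating vectors, and every m is scanned from N downward. Skipping inadmissible sizes would only save work, since an inadmissible m can never be the first size to meet the requirement. The name `select_sublot` is hypothetical; the sample x values are the x column of Table 1 in Section 3.

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def select_sublot(xs, u, rho, big_pi, zeta):
    """Return (m_star, zetas) with zetas[m] = P[V(m) >= ell(m)] given the x's.

    m_star is 0 when no m in 1..N meets the requirement.
    """
    xs = sorted(xs)                                        # step (1)
    s = math.sqrt(1.0 - rho**2)
    ps = [std_normal_cdf((u - rho * x) / s) for x in xs]   # step (2), eq. (3)

    dist = [1.0]       # dist[j] = P[W_1 + ... + W_k = j], starting from k = 0
    zetas = {}
    for m, p in enumerate(ps, start=1):
        new = [0.0] * (m + 1)
        for j, prob in enumerate(dist):
            new[j] += prob * (1.0 - p)                     # item m not acceptable
            new[j + 1] += prob * p                         # item m acceptable
        dist = new
        ell = math.ceil(big_pi * m - 1e-9)                 # ell(m), round-off guard
        zetas[m] = sum(dist[ell:])                         # value of eq. (5)

    # Steps (3)-(5): scan m = N, N-1, ... and stop at the first success
    for m in range(len(ps), 0, -1):
        if zetas[m] >= zeta:
            return m, zetas
    return 0, zetas

table1_x = [-1.8772, -0.8058, -0.6222, -0.4457, -0.0152,
            0.3443, 0.5310, 0.5431, 1.2019, 1.7573]
m_star, zetas = select_sublot(table1_x, u=-0.2533, rho=0.90,
                              big_pi=0.6, zeta=0.90)
print(m_star, round(zetas[m_star], 4))
```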

Note that while the sequence $\{\zeta_{m_j}\}$ is not strictly decreasing, empirical studies indicate that the size of any increase in successive terms is quite small relative to the typical amount of decrease. From a practical viewpoint, then, one might use a different algorithm. For instance one might choose a middle admissible $m_j$ and increase or decrease j depending on the value of $\zeta_{m_j}$.

If $m_j$ is sufficiently large, the value of $\zeta_{m_j}$ can be approximated by using a normal distribution. The use of this approximation can be justified by using a central limit theorem for independent but not identically distributed random variables. In particular

$$P[V(m_j) \ge \ell(m_j)] \approx 1 - \Phi\!\left[\frac{\ell(m_j) - .5 - \sum_{i=1}^{m_j} p_i}{\left(\sum_{i=1}^{m_j} p_i(1 - p_i)\right)^{1/2}}\right]. \tag{6}$$
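A minimal sketch of the approximation in (6) follows, assuming the first m conditional probabilities are already available in a list. The name `zeta_normal_approx` is hypothetical, and the function does not guard against the degenerate case in which every $p_i$ is 0 or 1 (zero variance).

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def zeta_normal_approx(ps, big_pi):
    """Approximate P[V(m) >= ell(m)] by (6), where m = len(ps)."""
    m = len(ps)
    ell = math.ceil(big_pi * m - 1e-9)           # ell(m), round-off guard
    mean = sum(ps)                               # E V(m)
    var = sum(p * (1.0 - p) for p in ps)         # Var V(m)
    # The .5 is the continuity correction appearing in (6)
    return 1.0 - std_normal_cdf((ell - 0.5 - mean) / math.sqrt(var))

# Illustrative input: 100 items with a common p = .5 and Pi = .5
print(round(zeta_normal_approx([0.5] * 100, 0.5), 4))
```

For small m the approximation can be noticeably off; with the first five $p_i$ of Table 1 it gives about .887, against the exact .8909 reported in Section 3.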


If N is relatively small so that $\zeta_m$ is to be found exactly by using (5) rather than being approximated by a normal distribution, the calculations can be quite tedious. One possible means of eliminating some of the calculations is to use Chebyshev's inequality. In particular, writing $q_i = 1 - p_i$, we have

$$E(V(m)) = \sum_{i=1}^{m} p_i, \qquad \sigma_{V(m)} = \Big(\sum_{i=1}^{m} p_i q_i\Big)^{1/2},$$

so if $(\Pi m - \sum_{i=1}^{m} p_i) > 0$, then

$$P[V(m) \ge \Pi m] = P\Big[V(m) - E(V(m)) \ge \Big(\Pi m - \sum_{i=1}^{m} p_i\Big)\Big] \le P\Big[\,|V(m) - E(V(m))| \ge \Big(\Pi m - \sum_{i=1}^{m} p_i\Big)\Big] \le \frac{1}{k^2},$$

where

$$k = \Big(\Pi m - \sum_{i=1}^{m} p_i\Big)\Big/\sigma_{V(m)}.$$

It follows that $P[V(m) \ge \Pi m] < \zeta$ provided that $(1/k^2) < \zeta$, i.e.

$$\frac{\sum_{i=1}^{m} p_i q_i}{\left(\Pi m - \sum_{i=1}^{m} p_i\right)^2} < \zeta. \tag{7}$$

Consequently for any admissible value of m for which the inequality in (7) holds, the value $\zeta_m$ will be less than $\zeta$, hence need not be calculated explicitly. If a normal approximation is to be used, the computations are quite simple and shortcut methods are not quite so necessary.
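The screening test in (7) is cheap to code. The sketch below returns True only when the bound applies and rules the size out; the name `chebyshev_rules_out` is hypothetical.

```python
def chebyshev_rules_out(ps, big_pi, zeta):
    """True when inequality (7) holds for m = len(ps), so zeta_m < zeta."""
    m = len(ps)
    gap = big_pi * m - sum(ps)          # Pi*m - sum of p_i; must be positive
    if gap <= 0:
        return False                    # the Chebyshev bound does not apply
    var = sum(p * (1.0 - p) for p in ps)
    return var / gap**2 < zeta

# Illustrative input: eight items, two of them likely acceptable
print(chebyshev_rules_out([0.95, 0.9, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05],
                          big_pi=0.6, zeta=0.9))
```

A False result says nothing either way: the size m must then be checked with (5) or (6).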

3. An Example

In this example we take N = 10. The data shown in Table 1 were generated from a bivariate normal distribution with $\mu_X = \mu_Y = 0$, $\sigma_X^2 = \sigma_Y^2 = 1$, and $\rho = .90$.

Table 1. Data for Example 1.

     x_i        y_i        p_i
  -1.8772    -1.3569     .9995
  - .8058    - .7349     .8606
  - .6222    - .8524     .7592
  - .4457    - .0962     .6327
  - .0152    - .8580     .2912
    .3443      .5514     .0982
    .5310      .5349     .0468
    .5431    - .0648     .0444
   1.2019     1.6161     .0011
   1.7573     1.3359     .0000

For convenience the x values have been placed in increasing order. The y values correspond to the appropriate x's. (That is, the (x,y) pairs are ordered by the first element.) We will take $\gamma = .4$, $\Pi = .6$, and $\zeta = .90$. That is, in the unscreened population $\gamma$ = 40% of the items are acceptable. We wish to choose a subset of the N = 10 items available such that at least $\Pi$ = 60% of the items in the screened subset are acceptable with probability $\zeta = .90$.

From this information we find the value of u. Since

$$P[Y \le u] = .40 = \gamma$$

and since Y has a standard normal distribution, it follows that u = -.2533. Following the steps of the algorithm, we next find the values

$$p_i = \Phi[(u - \rho x_i)/(1 - \rho^2)^{1/2}] = \Phi[(-.2533 - .90\,x_i)/.4359].$$

These values are shown in the third column of Table 1.

Next find the admissible m's. With $\Pi = .6$, the admissible values of m are 1, 3, 5, 6, 8, and 10. Direct calculations show that for m = 8 and 10, the inequality (7) holds, so $\zeta_8$ and $\zeta_{10}$ are both less than .90. Using (5) we find

$$\zeta_6 = .5751, \quad \zeta_5 = .8909, \quad \zeta_3 = .9663,$$

so we would take m* = 3. From Table 1 it can be seen that for the top three x values, each of the corresponding y's turned out to be acceptable (i.e. $y_i \le u = -.2533$). In this sample all screened items happened to be satisfactory. In general, by following this procedure, at least 60% of the screened sample would be acceptable at least 90% of the time.
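The value of u and the third column of Table 1 can be reproduced with the standard library alone. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) in place of normal tables; the variable names are illustrative.

```python
import math
from statistics import NormalDist

std = NormalDist()                 # standard normal, mean 0 and sd 1
gamma, rho = 0.40, 0.90

u = std.inv_cdf(gamma)             # solves P[Y <= u] = gamma

# x column of Table 1
xs = [-1.8772, -0.8058, -0.6222, -0.4457, -0.0152,
      0.3443, 0.5310, 0.5431, 1.2019, 1.7573]
s = math.sqrt(1.0 - rho**2)        # about .4359
ps = [std.cdf((u - rho * x) / s) for x in xs]   # equation (3)

print(f"u = {u:.4f}")
for x, p in zip(xs, ps):
    print(f"{x:8.4f}  {p:.4f}")
```

Small differences in the last printed digit are possible because the table was presumably computed with u rounded to four places.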

4. Comparison with Another Procedure

The procedure that we have proposed for screening is most similar to the one discussed in Owen, Chen, and Li [5]. For convenience we will refer to their procedure as the OCL procedure and will refer to ours as the Sigma procedure. The OCL procedure is to find a single cutoff score $k_0$ so that any item having an X score below $k_0$ is accepted. In order to find the value of $k_0$ from tables it is necessary to know the value of $m_0$, the total number of items to be accepted. This implies then that the number of items to be accepted is determined before inspection starts. It would be logical then to inspect the items one at a time, say as they become available. The inspection process would terminate when $m_0$ items have been accepted. One advantage of such a procedure is that it is immediately known whether an item is to be accepted or rejected. Of course it is possible that the pool of items which are being inspected is too small to be able to find $m_0$ acceptable items, in which case new screening criteria must be set forth, resulting in a new cutoff score, etc.

With the Sigma procedure, all N available items are inspected and the number which will ultimately be accepted, say M, is a random variable. Likewise the cutoff score is a random variable, say K. Since the value of K is not known until all N items have been inspected, even though the value of X and the corresponding value of p are known for a particular item, it may not be immediately known whether or not that item will ultimately be accepted. (In some cases, however, if the value of p is high enough, one can be virtually certain that the item will be acceptable. See Appendix 1.)

Monte Carlo studies were performed to compare the OCL and Sigma procedures quantitatively. Since there are some qualitative differences (e.g. in the OCL method $m_0$ is fixed while in the Sigma method M is random), some reasonable basis for comparison had to be made. We took $m_0$ to be 100 for the OCL method so that items were screened sequentially until 100 acceptable ones were found. We denote the random number which had to be screened by $N_{OCL}$. In the Sigma method N was determined empirically so that, in 500 Monte Carlo trials, the sample average value of M was also approximately 100. (We were satisfied if $\bar{m}$ was within 100 ± 1.) Some results of the Monte Carlo studies are shown in Table 2. There are four comparisons that can be made here:

(1) Observed sample proportion of satisfactory items.

(2) Estimated variance of proportion of satisfactory items.

(3) Estimated average number screened to get 100 acceptable.

(4) Observed number of times that $V \ge \Pi m$.

The first and third of these are probably of greatest interest. In the first case, in order to have a high probability that $V \ge \Pi m$, the actual proportion of satisfactory items must exceed $\Pi$. However it is advantageous to a manufacturer, for example, to exceed $\Pi$ by as little as possible. In almost every case the observed proportion is closer to $\Pi$ for the Sigma procedure, especially when $\Pi = \gamma$. To compare average numbers screened to get 100 acceptable items, we used an empirical determination of N so that $\bar{m}$ was within 100 ± 1. Then the estimated average number under the Sigma procedure was taken to be $100N/\bar{m}$. This number is compared with $\bar{N}_{OCL}$, the average number screened under the OCL procedure to get 100 acceptable items. In 22 out of 28 cases investigated the average is smaller for the Sigma procedure.

In 500 trials, the expected number of times that V will be at least $\Pi \cdot m$ should be $500 \cdot \zeta$ with a variance of $500\,\zeta(1 - \zeta)$. There are some situations where the observed number is higher than this expected number for the Sigma procedure. However the cases where this happens all correspond to cases where $\gamma = \Pi$. Closer examination of the Monte Carlo output revealed that in these cases there were several times when m*, the number accepted, was equal to N, i.e. all N items were "accepted." (See the last column of Table 2.) The implication of this is that $P[V(N) \ge \Pi \cdot N] \ge \zeta$, and in fact the probability most likely exceeds $\zeta$ by some amount. This leads to a higher expected value for the number of times that V will be at least $\Pi \cdot m$. In each of the four quantitative comparisons the Sigma procedure compares quite favorably with the OCL procedure.
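A small simulation in the spirit of comparison (4) can be sketched as follows. It is not the study reported in Table 2: the lot size, number of trials, seed, and the helper names `m_star_for` and `simulate` are all assumptions made for illustration, only the Sigma procedure is simulated, and X and Y are taken as standardized with known $\rho$.

```python
import math
import random

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def m_star_for(xs_sorted, u, rho, big_pi, zeta):
    """Largest m with P[V(m) >= ell(m) | x's] >= zeta (0 if there is none)."""
    s = math.sqrt(1.0 - rho**2)
    dist, best = [1.0], 0              # dist[j] = P[j acceptable among first k]
    for m, x in enumerate(xs_sorted, start=1):
        p = std_normal_cdf((u - rho * x) / s)
        new = [0.0] * (m + 1)
        for j, prob in enumerate(dist):
            new[j] += prob * (1.0 - p)
            new[j + 1] += prob * p
        dist = new
        if sum(dist[math.ceil(big_pi * m - 1e-9):]) >= zeta:
            best = m
    return best

def simulate(n_items, u, rho, big_pi, zeta, trials, seed=0):
    """Return (fraction of trials with V >= ell(m*), average m*)."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho**2)
    hits = used = total_m = 0
    for _ in range(trials):
        pairs = []
        for _ in range(n_items):
            x = rng.gauss(0.0, 1.0)
            y = rho * x + s * rng.gauss(0.0, 1.0)   # Y | X = x ~ N(rho*x, 1 - rho^2)
            pairs.append((x, y))
        pairs.sort()                                # increasing x, so decreasing p
        m = m_star_for([x for x, _ in pairs], u, rho, big_pi, zeta)
        if m == 0:
            continue                                # nothing selected in this lot
        v = sum(1 for _, y in pairs[:m] if y <= u)  # acceptable items selected
        hits += v >= math.ceil(big_pi * m - 1e-9)
        used += 1
        total_m += m
    return hits / used, total_m / used

frac, avg_m = simulate(n_items=40, u=-0.2533, rho=0.9,
                       big_pi=0.6, zeta=0.9, trials=300, seed=1)
print(frac, avg_m)
```

Because the conditional probability of meeting the requirement is at least $\zeta$ in every trial where something is selected, the observed fraction would be expected to sit at or above $\zeta$, apart from sampling error.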

Another comparison that can be made using Monte Carlo studies is with the fixed cutoff score ($k_0$) of the OCL procedure and the random cutoff score K of the Sigma procedure. Recall that in the OCL procedure an item is accepted if $X \le \mu_X + k_0\sigma_X = k_0$ (if $\mu_X = 0$ and $\sigma_X = 1$). In Table 3 the values $k_0$ and $\bar{K}$, the average cutoff score based on 500 trials, are compared. Since K is random, there are times when it is smaller than $k_0$ and other times when it is larger. If it is, on the average, larger than $k_0$, this indicates that the acceptance criterion is less stringent. The observed value of $\bar{K}$ is larger than $k_0$ in 23 out of 28 cases.

The fact that K is random and not fixed has the following implication. In repeated trials, a score of $x = k_0 + \epsilon$ would never be accepted under the OCL procedure while a score of $x = k_0 - \epsilon$ would always be accepted. However in repeated trials using the Sigma procedure, the probability that a score of $x_0$ will be accepted is a non-increasing function of $x_0$. This is illustrated in Figure 1. This figure is based on the empirical Monte Carlo results and does not give the exact distribution of K. From this we see that, rather than there being a strict cutoff (as in the OCL procedure), the lower (better) the x score attained by an item (or individual) the higher the probability that the item will be accepted.

Figure 1. Comparison of cut-off scores.

5. Conclusions

There are various situations under which screening may be done. Some examples are admissions tests for educational placement, competency tests for employment, or quality tests for a product. Based on Monte Carlo studies, the Sigma procedure that we have described compares favorably with the procedure suggested by Owen, Chen, and Li in terms of meeting the conditions given in (4),

$$P[V(m^*) \ge \ell(m^*)] \ge \zeta.$$

The comparison is also favorable in terms of the number of items N which need to be screened in order to obtain a specified number of acceptable items. The Sigma procedure may offer a disadvantage in that the X measurements need to be made on the entire population of N items before it can be determined which items can be accepted. One of the major advantages is that the Sigma procedure does not require any specialized tables to implement.

Appendix

Determination of a "Guaranteed" Acceptable Score

In Section 4 we pointed out that the X measurements need to be taken on the entire population of N objects before determining which items in the population will be in the acceptable subset. In the OCL procedure, as soon as the x score for an item is found, it is immediately known whether or not x is less than or equal to $k_0$, and hence whether or not the item is acceptable. However if we assume that we know approximately how many items out of N, say m, will ultimately be acceptable, and if m is large enough to use a normal approximation, then we can find an x score (or equivalently a value of p) for which the corresponding item is almost certain to be accepted. The most difficult case would be when all items are essentially at the same minimal score, so we will let p denote this common conditional probability. Using (6) with all $p_i = p$ we obtain

$$P[V(m) \ge \ell(m)] \approx 1 - \Phi[(\ell(m) - .5 - mp)/(mp(1 - p))^{1/2}]. \tag{8}$$

If we let $t = \ell(m) - .5$, then the right hand side of (8) will be at least $\zeta$ if

$$(t - mp)/(mp(1 - p))^{1/2} \le z, \tag{9}$$

where $z = z_{1-\zeta}$ denotes the $100(1 - \zeta)\%$ point of the standard normal distribution. Solving the inequality in (9) for p via the quadratic formula yields

$$p \ge \frac{(t + z^2/2) + |z|\,[\,t(1 - t/m) + z^2/4\,]^{1/2}}{m + z^2}.$$

The corresponding x values can be found by solving (3) to get

$$x \le \frac{u - \Phi^{-1}(p)\,[1 - \rho^2]^{1/2}}{\rho}.$$

From the empirical Monte Carlo studies these values of p and x appear to be highly conservative.

As an example, if $\zeta = .9$, $\rho = .9$, $\gamma = .4$, and $\Pi = .4$, and if we assume that m will be about 100, then we will find

$$p \ge \frac{39.5 + (-1.282)^2/2 + 1.282\,[\,39.5(1 - .395) + (-1.282)^2/4\,]^{1/2}}{100 + (-1.282)^2} = .4589$$

and the corresponding values of x would be

$$x \le \frac{-.2533 - \Phi^{-1}(.4589)\,[1 - (.9)^2]^{1/2}}{.9} = -.2315.$$
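The Appendix bound can be evaluated without tables. The sketch below follows the two displayed formulas, assuming $\zeta > .5$ so that z is negative, which is what the step from (9) to the quadratic solution relies on. The name `guaranteed_score` is hypothetical, and `statistics.NormalDist` (Python 3.8 or later) supplies $\Phi^{-1}$.

```python
import math
from statistics import NormalDist

def guaranteed_score(m, big_pi, zeta, u, rho):
    """Return (p_min, x_max): the smallest common p and the largest x score
    for which an item is almost certain to be accepted under the normal
    approximation (8). Assumes zeta > .5 and standardized X, Y.
    """
    std = NormalDist()
    t = math.ceil(big_pi * m - 1e-9) - 0.5      # t = ell(m) - .5
    z = std.inv_cdf(1.0 - zeta)                 # 100(1 - zeta)% point, negative here
    root = math.sqrt(t * (1.0 - t / m) + z * z / 4.0)
    p_min = (t + z * z / 2.0 + abs(z) * root) / (m + z * z)
    x_max = (u - std.inv_cdf(p_min) * math.sqrt(1.0 - rho**2)) / rho
    return p_min, x_max

p_min, x_max = guaranteed_score(m=100, big_pi=0.4, zeta=0.9, u=-0.2533, rho=0.9)
print(round(p_min, 4), round(x_max, 4))
```

With these inputs the result agrees with the worked example to about three decimal places; the last digit can differ slightly because the example uses table values rounded to three or four places.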